Exercise 05.1 (random numbers)


In [2]:
from random import randint

def roll(n):
    "Return the result of rolling an n-sided die"
    return randint(1, n)

In [3]:
def test_six_sided_dice(n_rolls):
    "Simulate n_rolls rolls of a six-sided die and return the relative frequency of each side"
    # One counter per side (index 0 holds the count for side 1)
    rolls = [0, 0, 0, 0, 0, 0]
    
    # Roll the die and increment the counter of the side that came up
    for _ in range(n_rolls):
        rolls[roll(6)-1] += 1
    
    # Convert the counts to relative frequencies
    for i in range(6):
        rolls[i] = rolls[i] / n_rolls
    
    return rolls

# Simulate 100000 rolls (each frequency should be close to 1/6 = 0.1667)
test_six_sided_dice(100000)


Out[3]:
[0.164, 0.1683, 0.1659, 0.16955, 0.16619, 0.16606]
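The same check can be sketched for a die with any number of sides. The version below is only an illustration, not part of the exercise: the names `roll_frequencies` and `seed` are assumptions introduced here, and `collections.Counter` replaces the hand-written list of counters. Passing a seed to a separate `random.Random` generator makes a run repeatable.

```python
import random
from collections import Counter

def roll_frequencies(n_sides, n_rolls, seed=None):
    "Return the observed relative frequency of each side of an n_sides-sided die"
    # A separate generator, so that a seed gives repeatable results
    rng = random.Random(seed)

    # Count how often each side comes up
    counts = Counter(rng.randint(1, n_sides) for _ in range(n_rolls))

    # Sides that never came up have a count of zero in a Counter
    return [counts[side] / n_rolls for side in range(1, n_sides + 1)]

freqs = roll_frequencies(6, 100000, seed=0)
max_dev = max(abs(f - 1/6) for f in freqs)
print(freqs)
print("Largest deviation from 1/6:", max_dev)
```

For a fair die the largest deviation should shrink as the number of rolls grows, roughly in proportion to one over the square root of the number of rolls.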

Exercise 05.2 (data compression)

For devices with limited memory, data compression can be important. Data compression is a field of its own, but with libraries we can compress (and decompress) data easily.

The program below compresses a passage from Hamlet, by Shakespeare, and then decompresses it to check that the original text is recovered.


In [4]:
# Import the compression module
import zlib

# Create a string that we wish to compress
text = """
Welcome, dear Rosencrantz and Guildenstern!
Moreover that we much did long to see you,
The need we have to use you did provoke
Our hasty sending. Something have you heard
Of Hamlet's transformation; so call it,
Sith nor the exterior nor the inward man
Resembles that it was. What it should be,
More than his father's death, that thus hath put him
So much from the understanding of himself,
I cannot dream of: I entreat you both,
That, being of so young days brought up with him,
And sith so neighbour'd to his youth and havior,
That you vouchsafe your rest here in our court
Some little time: so by your companies
To draw him on to pleasures, and to gather,
So much as from occasion you may glean,
Whether aught, to us unknown, afflicts him thus,
That, open'd, lies within our remedy."""

# Convert Python string to bytes, and check type
text_bytes = text.encode("utf-8")
print(type(text_bytes))

# Get number of bytes (memory) used to store string
print("Number of bytes for uncompressed string:", len(text_bytes))

# Compress string and get number of bytes used for compressed string
text_comp = zlib.compress(text_bytes)
print("Number of bytes for compressed string:", len(text_comp))

# Display the compression efficiency (compressed size / original size; smaller is better)
print("Compression efficiency: ", len(text_comp)/len(text_bytes))

# Decompress the string
text_decomp = zlib.decompress(text_comp)

# Check that original and decompressed strings are the same
if text == text_decomp.decode("utf-8"):
    print("All good: original and decompressed strings are the same.")
else:
    print("Problem: original and decompressed strings differ.")


<class 'bytes'>
Number of bytes for uncompressed string: 785
Number of bytes for compressed string: 466
Compression efficiency:  0.5936305732484076
All good: original and decompressed strings are the same.

Using the above as a guide, examine the compression efficiency of

  1. Compressing one large string made up of the passage by Shakespeare repeated 100 times; and
  2. Compressing a random string of the same length as the repeated Shakespeare passage.

To help you, the function below generates a random string of length N:


In [5]:
import random
import string

def random_string(N):
    return ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(N))

print(random_string(8))


bTzGRl7L
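For long strings, such as the one needed in this exercise, the characters can also be drawn in a single call. The following is a minimal sketch of an equivalent helper using `random.choices` (available from Python 3.6); the names `ALPHABET` and `random_string_choices` are introduced here for illustration only.

```python
import random
import string

# Same 62-character alphabet as random_string: letters and digits
ALPHABET = string.ascii_letters + string.digits

def random_string_choices(N):
    "Return a random string of N characters drawn uniformly from ALPHABET"
    # random.choices samples with replacement and returns a list of length k
    return ''.join(random.choices(ALPHABET, k=N))

sample = random_string_choices(8)
print(sample)
```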

In [6]:
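def compression_efficiency(s):
    "Return the compression efficiency (compressed size / original size) obtained with zlib.compress"
    # Convert Python string to bytes
    # (the parameter is named s so that it does not shadow the imported string module)
    text_bytes = s.encode('utf-8')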

    # Compress string
    text_comp = zlib.compress(text_bytes)

    # Calculate the compression efficiency
    return len(text_comp)/len(text_bytes)

# Comparison of compression efficiency (the smaller the better)
print("Compression efficiency for Shakespeare's passage repeated 100 times:", compression_efficiency(text*100))
print('Compression efficiency for a random string of the same length:', compression_efficiency(random_string(len(text)*100)))


Compression efficiency for Shakespeare's passage repeated 100 times: 0.01178343949044586
Compression efficiency for a random string of the same length: 0.7519235668789809
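
The two results differ for a simple reason. The repeated passage is highly redundant: zlib's DEFLATE format can encode each repetition as a short back-reference to the previous copy, so the compressed size barely grows with the number of repetitions. The random string has no such structure. Its characters are drawn uniformly from 62 symbols, so each one carries about log2(62) = 5.95 bits of information while being stored in an 8-bit byte, which means no lossless compressor can be expected to do better than about 5.95/8 = 0.744. The sketch below computes this bound and compares it with what zlib achieves; the variable names are introduced here, and the sample length of 78500 is 100 times the 785 characters of the passage.

```python
import math
import random
import string
import zlib

# The 62 symbols used by random_string
alphabet = string.ascii_letters + string.digits

# Information carried by one uniformly random character, in bits
bits_per_char = math.log2(len(alphabet))

# Each character is stored in one 8-bit byte, so this is the best achievable ratio
bound = bits_per_char / 8

# Measure what zlib achieves on a fresh random string of the same length as in the exercise
sample = ''.join(random.choices(alphabet, k=78500)).encode("utf-8")
measured = len(zlib.compress(sample)) / len(sample)

print("Entropy bound:", bound)
print("Measured compression efficiency:", measured)
```

The measured value should sit slightly above the bound, consistent with the figure of roughly 0.75 reported above.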